Some New Bounds on the Generalization Error of Combined Classifiers
Authors
Abstract
In this paper we develop the method of bounding the generalization error of a classifier in terms of its margin distribution, which was introduced in recent papers of Bartlett and of Schapire, Freund, Bartlett and Lee. The theory of Gaussian and empirical processes allows us to prove margin-type inequalities for very general function classes, the complexity of a class being measured via so-called Gaussian complexity functions. As a simple application of our results, we obtain the bounds of Schapire, Freund, Bartlett and Lee for the generalization error of boosting. We also substantially improve the results of Bartlett on bounding the generalization error of neural networks in terms of the l1-norms of the weights of the neurons. Furthermore, under additional assumptions on the complexity of the class of hypotheses, we provide tighter bounds, which in the case of boosting improve the results of Schapire, Freund, Bartlett and Lee.
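To fix ideas, a margin-type bound of the kind described above can be written schematically as follows; the constant C and the lower-order terms are indicative only, not the paper's exact statement. For a class F of functions with values in [-1, 1] and an i.i.d. sample of size n, with probability at least 1 - alpha,

```latex
P\{\, y f(x) \le 0 \,\}
  \;\le\; \inf_{\delta \in (0,1]} \left[
      P_n\{\, y f(x) \le \delta \,\}
      \;+\; \frac{C}{\delta}\, G_n(\mathcal{F})
      \;+\; \sqrt{\frac{\log(2/\alpha)}{2n}}
  \right]
```

where P_n denotes the empirical measure of the sample and G_n(F) the Gaussian complexity function of the class. The trade-off is visible in the infimum: a larger margin parameter delta shrinks the complexity term but can inflate the empirical margin term.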
Similar Papers
Bounding the Generalization Error of Neural Networks and Combined Classifiers
Recently, several authors developed a new approach to bounding the generalization error of complex classifiers (of large or even infinite VC-dimension) obtained by combining simpler classifiers. The new bounds are in terms of the distributions of the margin of the combined classifiers, and they provide some theoretical explanation of the generalization performance of large neural networks. We obtained new probabilistic ...
Empirical Margin Distributions and Bounding the Generalization Error of Combined Classifiers
We prove new probabilistic upper bounds on the generalization error of complex classifiers that are combinations of simple classifiers. Such combinations could be implemented by neural networks or by voting methods of combining the classifiers, such as boosting and bagging. The bounds are in terms of the empirical distribution of the margin of the combined classifier. They are based on the methods ...
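The empirical margin distribution that these bounds refer to is easy to compute for a voting classifier. The sketch below is illustrative, not code from any of the papers: the base classifiers are hypothetical decision stumps, and f is their convex combination, so each margin y*f(x) lies in [-1, 1].

```python
# Illustrative sketch: empirical margin distribution of a convex
# combination f(x) = sum_t w_t h_t(x) of base classifiers h_t in {-1, +1},
# with weights w_t >= 0 summing to 1 (as in boosting-style voting).

def empirical_margins(X, y, stumps, weights):
    """Margin y * f(x) for each example.

    Each stump is a hypothetical (feature_index, threshold, sign) rule
    predicting +sign if x[feature_index] > threshold, else -sign.
    """
    margins = []
    for x, label in zip(X, y):
        f = sum(w * (s if x[j] > t else -s)
                for (j, t, s), w in zip(stumps, weights))
        margins.append(label * f)
    return margins

def margin_fraction_below(margins, delta):
    """Empirical probability P_n{ y f(x) <= delta }."""
    return sum(m <= delta for m in margins) / len(margins)

# Toy 1-D data: label is the sign of the coordinate.
X = [(-2.0,), (-0.5,), (0.3,), (1.5,)]
y = [-1, -1, 1, 1]
stumps = [(0, 0.0, 1), (0, -1.0, 1)]   # hypothetical base stumps
weights = [0.7, 0.3]

margins = empirical_margins(X, y, stumps, weights)
```

Plotting `margin_fraction_below` as a function of delta gives the empirical margin distribution function that appears on the right-hand side of the bounds.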
Generalization error bounds for classifiers trained with interdependent data
In this paper we propose a general framework for studying the generalization properties of binary classifiers trained on data that may be dependent but are deterministically generated from a sample of independent examples. The framework provides generalization bounds for binary classification and for some cases of ranking problems, and it clarifies the relationship between these learning tasks.
The University of Chicago Algorithmic Stability and Ensemble-based Learning a Dissertation Submitted to the Faculty of the Division of the Physical Sciences in Candidacy for the Degree of Doctor of Philosophy Department of Computer Science by Samuel Kutin
We explore two themes in formal learning theory. We begin with a detailed, general study of the relationship between the generalization error and stability of learning algorithms. We then examine ensemble-based learning from the points of view of stability, decorrelation, and threshold complexity. A central problem of learning theory is bounding generalization error. Most such bounds have been ...
Generalization Error Bounds for Threshold Decision Lists
In this paper we consider the generalization accuracy of classification methods based on the iterative use of linear classifiers. The resulting classifiers, which we call threshold decision lists, act as follows. Some points of the data set to be classified are given a particular classification according to a linear threshold function (or hyperplane). These are then removed from consideration, a...
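At prediction time the iterative scheme just described reduces to a first-rule-that-fires list. The sketch below is a hypothetical illustration of that behavior, not code from the paper: each rule (w, b, label) assigns `label` to points with w . x + b > 0, and points not captured fall through to the next rule.

```python
# Illustrative sketch of a threshold decision list: rules are tried in
# order, and the first linear threshold rule that fires decides the label;
# points captured by an earlier hyperplane never reach the later rules.

def predict_decision_list(rules, default_label, x):
    """Classify x by the first rule (w, b, label) with w . x + b > 0."""
    for w, b, label in rules:
        if sum(wi * xi for wi, xi in zip(w, x)) + b > 0:
            return label
        # otherwise the point falls through to the next rule
    return default_label

# Hypothetical 2-D list: first peel off points with x0 > 1 as +1,
# then points with x1 > 0 as -1; everything remaining defaults to +1.
rules = [((1.0, 0.0), -1.0, +1),
         ((0.0, 1.0), 0.0, -1)]

predict_decision_list(rules, +1, (2.0, -3.0))  # first rule fires: +1
```

The ordering matters: the same two hyperplanes in the opposite order generally define a different classifier, which is exactly what distinguishes a decision list from a single vote over the rules.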